Constraint Score: A new filter method for feature selection with pairwise constraints

نویسندگان

  • Daoqiang Zhang
  • Songcan Chen
  • Zhi-Hua Zhou
چکیده

Feature selection is an important preprocessing step in mining high-dimensional data. Generally, supervised feature selection methods with supervision information are superior to unsupervised ones without supervision information. In the literature, nearly all existing supervised feature selection methods use class labels as supervision information. In this paper, we propose to use another form of supervision information for feature selection, i.e. pairwise constraints, which specifies whether a pair of data samples belong to the same class (must-link constraints) or different classes (cannot-link constraints). Pairwise constraints arise naturally in many tasks and are more practical and inexpensive than class labels. This topic has not yet been addressed in feature selection research. We call our pairwise constraints guided feature selection algorithm as Constraint Score and compare it with the well-known Fisher Score and Laplacian Score algorithms. Experiments are carried out on several high-dimensional UCI and face data sets. Experimental results show that, with very few pairwise constraints, Constraint Score achieves similar or even higher performance than Fisher Score with full class labels on the whole training data, and significantly outperforms Laplacian Score.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...

متن کامل

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...

متن کامل

Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gen Expression

Nowadays, increasing the volume of data and the number of attributes in the dataset has reduced the accuracy of the learning algorithm and the computational complexity. A dimensionality reduction method is a feature selection method, which is done through filtering and wrapping. The wrapper methods are more accurate than filter ones but perform faster and have a less computational burden. With ...

متن کامل

Selection of Favorite Reusable Launch Vehicle Concepts by using the Method of Pairwise Comparison

The contribution of this paper to the space transportation system field is to select promising Reusable Launch Vehicle (RLV) concepts by using a formal evaluation procedure. The vehicle system is divided into design features. Every design feature can have alternative characteristics. All combinations of design features and characteristics are compared pairwise with each other. The innovation an...

متن کامل

Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge

The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory used for in clustering problems. Previous researches showed that this theory can significantly increase the stability and performance of...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Pattern Recognition

دوره 41  شماره 

صفحات  -

تاریخ انتشار 2008